Animated Map Plot with ggplot & ggmap
2023-09-07
Animated Map Plot with gganimate
This article is a continuation of the cyclistic_case_study article. This time we will create an animated graph of the number of trips made by casual cyclists to each end station, using a combination of ggplot and gganimate.
gganimate is an extension of the ggplot2 package for creating animated ggplots. It provides a range of new functionality that can be added to the plot object in order to customize how it should change with time.
- Key features of gganimate:
- transitions : you want your data to change
- views : you want your viewpoint to change
- shadows : you want the animation to have memory
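As a minimal sketch of how these three pieces attach to an ordinary ggplot object, the example below uses the built-in mtcars data set rather than the Cyclistic data. The chosen columns and parameter values are illustrative assumptions, not settings taken from this article.

```r
library(ggplot2)
library(gganimate)

# A static scatter plot built from the built-in mtcars data set
static_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

anim <- static_plot +
  # transition: step through the data one gear group at a time
  transition_states(gear, transition_length = 2, state_length = 1) +
  # view: let the axes follow the data shown in each frame
  view_follow() +
  # shadow: keep a fading trail of earlier frames
  shadow_wake(wake_length = 0.1)

# Rendering needs a renderer such as gifski; uncomment to render
# animate(anim, nframes = 50, fps = 10)
```

Each component is simply added to the plot with `+`, so any of the three can be left out or swapped without touching the static plot.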
Packages and libraries used in this article
# package installation
devtools::install_github('thomasp85/gganimate') # install gganimate package
# data manipulation & processing libraries
library(dplyr)
library(janitor)
library(lubridate)
library(DBI)
library(RMySQL)
library(tidyr)
# static visualization libraries
library(ggplot2)
library(ggdark)
library(ggmap)
library(scales)
# animated visualization libraries
library(gifski)
library(gganimate)
A. LOAD DATA
The data we will use is stored in a MySQL database. Here is how to load it:
# create a connection to the database
mysqlconn <- dbConnect(MySQL(),
                       user = 'root', password = 'rootpass',
                       dbname = 'DB_TELKOM', host = 'localhost')
# load the data in two batches
raw_data_2022 <- dbGetQuery(mysqlconn, "select ride_id, end_station_name,
                            end_lat, end_lng, ended_at, member_casual
                            from cyclistic_2022")
raw_data_2023 <- dbGetQuery(mysqlconn, "select ride_id, end_station_name,
                            end_lat, end_lng, ended_at, member_casual
                            from cyclistic_2023")
# close the database connection
dbDisconnect(mysqlconn)
# bind data
raw_data <- bind_rows(raw_data_2022, raw_data_2023)
# remove the intermediate data and the connection object
remove(raw_data_2023)
remove(raw_data_2022)
remove(mysqlconn)
B. CLEANING, MANIPULATION & PROCESSING RAW_DATA
The following steps clean and prepare the raw data:
# 1. Cleaning column names
raw_data <- clean_names(raw_data)
# 2. Removing empty rows and columns
raw_data <- remove_empty(raw_data, which = c("rows", "cols"), cutoff = 1)
# 3. Removing rows without an end station name (blank strings become NA first)
raw_data$end_station_name <- replace(raw_data$end_station_name, raw_data$end_station_name == "", NA)
raw_data <- raw_data %>% filter(!is.na(end_station_name))
# 4. Data type manipulation: date handling
raw_data$ended_at <- ymd_hms(raw_data$ended_at) # convert to date-time (POSIXct)
# 5. Create an additional column (month_trip) holding the calendar date of each trip
raw_data <- raw_data %>% mutate(month_trip = as.Date(ended_at))
# 6. Sort data
raw_data <- arrange(raw_data, ended_at)
C. GET ANALYSED DATA
The raw data cannot be analyzed as it stands, because we only need a few of its columns: ended_at, end_station_name, end_lat, end_lng, month_trip, member_casual and ride_id.
The end_lng and end_lat columns also contain duplicates, which can reduce or even distort the information we want to visualize. The steps below prepare data that is ready to be analyzed in this case.
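To make the idea concrete, here is a small sketch on made-up data (the station names, coordinates and dates are invented for illustration). It assumes that the duplicates show up as one station recorded with several coordinate pairs. The sketch counts trips per station and date, accumulates them per station, and keeps a single coordinate pair per station.

```r
library(dplyr)

# Toy data, invented for illustration: station "A" appears with two
# slightly different coordinate pairs
toy <- data.frame(
  ride_id          = c("r1", "r2", "r3", "r4", "r5"),
  end_station_name = c("A", "A", "A", "B", "B"),
  end_lat          = c(41.8801, 41.8802, 41.8801, 41.9000, 41.9000),
  end_lng          = c(-87.6301, -87.6302, -87.6301, -87.6500, -87.6500),
  month_trip       = as.Date(c("2022-01-01", "2022-01-01", "2022-01-02",
                               "2022-01-01", "2022-01-02"))
)

# Trips per station and date, then a running total per station
trips <- toy %>%
  count(end_station_name, month_trip, name = "sum_trip") %>%
  group_by(end_station_name) %>%
  arrange(month_trip, .by_group = TRUE) %>%
  mutate(cumsum_trip = cumsum(sum_trip)) %>%
  ungroup()

# One coordinate pair per station (the first one seen)
coords <- toy %>%
  distinct(end_station_name, .keep_all = TRUE) %>%
  select(end_station_name, end_lat, end_lng)

# Attach the coordinates to the trip counts
toy_join <- left_join(trips, coords, by = "end_station_name")
```

Because `coords` holds exactly one row per station, the join cannot multiply the rows of `trips`. The preparation code for the full data set follows.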
# data for plot
# only casual cyclists are plotted, so members are filtered out
map_data <- raw_data %>%
  select(ended_at, end_station_name, end_lat, end_lng, month_trip, member_casual, ride_id) %>%
  filter(member_casual == 'casual') %>%
  # count trips per end station and date
  group_by(month_trip, end_station_name) %>%
  mutate(numtrips = n()) %>%
  # keep one row (and one coordinate pair) per station and date
  distinct(end_station_name, .keep_all = TRUE) %>%
  ungroup()
# create an additional column for the cumulative sum of numtrips per station
map_data_v2 <- map_data %>%
  select(month_trip, end_station_name, numtrips) %>%
  group_by(end_station_name, month_trip) %>%
  arrange(month_trip) %>%
  summarize(sum_trip = sum(numtrips)) %>%
  mutate(cumsum_trip = cumsum(sum_trip)) %>%
  arrange(end_station_name, month_trip)
# get latitude and longitude for each end_station_name
# right-hand data frame for the join, with duplicate station names removed
right_data <- map_data[, c("end_station_name", "end_lat", "end_lng")] %>%
  filter(!duplicated(end_station_name))
# an id column can be added here if it is needed later
# map_data_v2$id_join <- paste(map_data_v2$end_station_name, map_data_v2$month_trip, sep = "/")
map_data_join <- left_join(x = map_data_v2, y = right_data,
                           by = "end_station_name") %>%
  arrange(month_trip)
D. GETTING STARTED WITH STATIC MAP PLOT
We now have data that is ready to be visualized (map_data_join). To draw a map plot, we need to prepare a base layer in the form of map tiles, which we can get through the following website:
- Get Map Image Format / Map Tiles
Website: openstreetmap link. This website is used to find the longitude, latitude and zoom level we need. Keep in mind that the zoom has to be adjusted manually on the website until it looks right.
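As a complement to reading these values off the website, a bounding box and a rough starting zoom can also be estimated from the station coordinates themselves. The sketch below uses ggmap's make_bbox() and calc_zoom() helpers on a few made-up coordinates (invented for illustration). The suggested zoom is only a starting point and will usually still need manual adjustment.

```r
library(ggmap)

# Toy station coordinates, invented for illustration
stations <- data.frame(
  end_lng = c(-87.6301, -87.6500, -87.6120),
  end_lat = c(41.8801, 41.9000, 41.8675)
)

# Bounding box around the points, padded by 5% on each side
bbox <- make_bbox(lon = end_lng, lat = end_lat, data = stations, f = 0.05)

# Rough zoom level that covers the bounding box
zoom <- calc_zoom(bbox)

print(bbox)
print(zoom)
```

The bounding box is a named vector with the elements left, bottom, right and top, which is the form ggmap's tile functions expect for a location.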
To get map tiles, first open the openstreetmap website, then follow the steps shown in the following image: